April 30, 2018

Overview of Work/Research

  • Segmentation/Classification of:
    • White Matter Lesions in Multiple Sclerosis
    • Brain vs. Skull (CT)
    • Brain Hemorrhage/Stroke (CT)
  • R Package Development/“Data Science”
  • Neuroimaging and R (Neuroconductor Project)

Brain Image Processing in R

Workflow for an Analysis

  • bash
  • FSL
  • ANTs
  • MRIcroGL
  • OsiriX
  • SPM 12

Multiple pieces of software used

  • all different syntax

Goal:

Lower the bar to entry

  • all R code
    • pipeline tool
    • “native” R code

Complete pipeline

  • preprocessing and analysis

Bioinformatics Repository: Bioconductor

  • centralized bioinformatics/genomics packages
  • large community/number of packages (> 1300)
  • published tutorials and workflows
  • additional requirements to CRAN (e.g. packages need vignettes)

Neuroconductor: An R Platform for Medical Imaging Analysis

Lesion Segmentation of MS

Public Dataset with Lesion Segmentation

  • “A novel public MR image dataset of multiple sclerosis patients with lesion segmentations based on multi-rater consensus” (Lesjak et al. 2018)

Demographic Data

  • On many different therapies (9 no therapy)
Variable                          Overall
n                                 30
Age (mean (sd))                   39.27 (10.12)
sex = M (%)                       7 (23.3)
EDSS (mean (sd))                  2.61 (1.88)
Lesion_Volume (mean (sd))         17.40 (16.13)
MS_Subtype (%)
  Clinically Isolated Syndrome    2 (6.7)
  Progressive-relapsing           1 (3.3)
  Relapsing-remitting             24 (80.0)
  Secondary-progressive           2 (6.7)
  Unspecified                     1 (3.3)

Imaging Data

  • 2D T1 (TR=2000ms, TE=20ms, TI=800ms), acquired before and after gadolinium contrast
  • 2D T2 (TR=6000ms, TE=120ms)
  • 3D FLAIR (TR=5000ms, TE=392ms, TI=1800ms)
    • Fluid-attenuated inversion recovery: suppresses the signal from fluid
  • All had a flip angle of 120\(^{\circ}\)

Terminology: Neuroimaging to Data/Statistics

  • Segmentation ⇔ classification
  • Image ⇔ 3-dimensional array
  • Mask/Region of Interest ⇔ binary (0/1) image
  • Registration ⇔ Spatial Normalization/Standardization
    • “Lining up” Brains
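
The image and mask correspondences above can be made concrete in base R. This is a minimal sketch, assuming a small simulated 4 × 4 × 3 volume in place of a real scan and an arbitrary intensity threshold of 1; neither comes from the analysis in this talk.

```r
set.seed(1)

# An image is a 3-dimensional array of voxel intensities (simulated here)
img <- array(rnorm(4 * 4 * 3), dim = c(4, 4, 3))

# A mask / region of interest is a binary (0/1) image with the same dimensions
mask <- array(as.integer(img > 1), dim = dim(img))

# Intensities of the voxels that fall inside the mask
roi_values <- img[mask == 1]

dim(img)
table(mask)
```

Indexing the image by the mask is the basic operation behind most of the later steps: it turns a 3D image into a plain vector of voxel values.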

Image Representation: voxels (3D pixels)

Step 1: Create Predictors for Each Sequence

Data Structure for One Patient
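
One way to picture this structure is a table with one row per brain voxel and one column per predictor. The sketch below assumes simulated, already co-registered images; the column names, the brain mask, and the lesion labels are hypothetical stand-ins, and the real predictor set in the analysis may be larger.

```r
set.seed(2)
dims <- c(4, 4, 3)

# Simulated stand-ins for one patient's co-registered images
seqs <- c("T1", "T1Post", "T2", "FLAIR")
imgs <- lapply(seqs, function(s) array(rnorm(prod(dims)), dim = dims))
names(imgs) <- seqs

# Hypothetical brain mask and manual lesion segmentation (the outcome)
brain_mask <- array(rbinom(prod(dims), 1, 0.8), dim = dims)
lesion     <- array(rbinom(prod(dims), 1, 0.1), dim = dims)

# Keep only voxels inside the brain: one row per voxel, one column per image
in_brain <- which(brain_mask == 1)
patient_df <- data.frame(
  voxel = in_brain,
  lapply(imgs, function(x) x[in_brain]),
  y = lesion[in_brain]
)

head(patient_df)
```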

Step 2: Aggregate Data

Training Data Structure

  • Stack together 14 randomly selected patients, stratified by age (over median) and volume
  • Train model/classifier on this design matrix
  • Test on the remaining 16 held-out patients
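
A stratified split of this kind can be sketched in base R. The patient table below is simulated, and the exact stratification rule is an assumption: strata are formed by crossing "age above the median" with "volume above the median", and roughly 14/30 of each stratum is sampled, so the training set may not contain exactly 14 patients.

```r
set.seed(20180430)
n <- 30

# Simulated patient-level data (not the real demographics)
patients <- data.frame(
  id     = seq_len(n),
  age    = round(rnorm(n, mean = 39, sd = 10)),
  volume = rexp(n, rate = 1 / 17)
)

# Strata: age above/below the median crossed with volume above/below the median
patients$stratum <- interaction(
  patients$age > median(patients$age),
  patients$volume > median(patients$volume),
  drop = TRUE
)

# Sample about 14/30 of the patients within each stratum
ids_by_stratum <- split(patients$id, patients$stratum)
train_ids <- unlist(lapply(ids_by_stratum, function(ids) {
  n_take <- round(length(ids) * 14 / n)
  ids[sample.int(length(ids), n_take)]
}), use.names = FALSE)

# Everyone else is held out for testing
test_ids <- setdiff(patients$id, train_ids)

c(train = length(train_ids), test = length(test_ids))
```

The per-patient voxel tables for the training IDs would then be stacked row-wise (for example with `do.call(rbind, ...)`) to form the design matrix.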

Step 3: Fit Models / Classifier

Let \(Y_{i}(v)\) indicate the presence (1) or absence (0) of lesion at voxel \(v\) for person \(i\), and let \(X_{i}(v)\) be the vector of predictors at that voxel.

General model form: \[ P(Y_{i}(v) = 1) \propto f(X_{i}(v)) \]

Models Fit on the Training Data

  • OASIS: logistic regression on the images and the interactions of each image with its 10mm\(^3\)- and 20mm\(^3\)-smoothed versions (Sweeney et al. 2013). Does not include the T1-Post image
    • Fit with both the original model from the paper and a model re-trained on these data
  • Logistic Regression: \(f(X_{i}(v)) = \text{expit} \left\{ \beta_0 + \sum_{k= 1}^{p} x_{i, k}(v)\beta_{k}\right\}\)
  • Random Forests (Breiman 2001; Wright and Ziegler 2017)
    • With and without the T1-Post for comparison to OASIS
      \(f(X_{i}(v)) \propto\) RF
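
The logistic regression form can be illustrated with `glm()` on simulated voxel-level data. This is a sketch only: the predictors, the "true" coefficients, and the 0.5 cutoff are all assumptions made for illustration and are not the OASIS model or the fitted models from this analysis. In R, `plogis()` is the expit function.

```r
set.seed(3)
n_vox <- 5000

# Hypothetical normalized intensities for the stacked training voxels
train <- data.frame(
  FLAIR = rnorm(n_vox),
  T1    = rnorm(n_vox),
  T2    = rnorm(n_vox)
)

# Simulate lesion labels from a known logistic model: expit(beta_0 + sum_k x_k beta_k)
true_beta <- c(-3, 2, -1, 0.5)
eta <- true_beta[1] + as.matrix(train) %*% true_beta[-1]
train$y <- rbinom(n_vox, size = 1, prob = plogis(eta))

# Fit the voxel-level logistic regression
fit <- glm(y ~ FLAIR + T1 + T2, data = train, family = binomial())

# Predicted lesion probability per voxel, then a binary mask at an arbitrary cutoff
p_hat <- predict(fit, newdata = train, type = "response")
pred_mask <- as.integer(p_hat > 0.5)

round(coef(fit), 2)
```

A random forest would replace the `glm()` call with a forest fit on the same design matrix (for example via the ranger package cited above) and use its predicted probabilities in the same way.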

Dice Results (Triangle is population Dice)
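
For reference, the Dice coefficient between a predicted mask \(A\) and a manual mask \(B\) is \(2|A \cap B| / (|A| + |B|)\). The sketch below computes it per patient and pooled over all voxels on toy masks; treating "population Dice" as this pooled value is an assumption, since the slides do not define it.

```r
# Dice overlap between two binary masks (any array or vector of 0/1 values)
dice <- function(pred, truth) {
  pred  <- as.logical(pred)
  truth <- as.logical(truth)
  denom <- sum(pred) + sum(truth)
  if (denom == 0) return(NA_real_)  # undefined when both masks are empty
  2 * sum(pred & truth) / denom
}

# Two toy patients: predicted and manual masks as 0/1 vectors
pred  <- list(c(1, 1, 0, 0, 1), c(0, 1, 0, 0, 0))
truth <- list(c(1, 0, 0, 0, 1), c(0, 1, 1, 0, 0))

# One Dice value per patient
per_patient <- mapply(dice, pred, truth)

# Pooled Dice: all voxels from all patients treated as one large mask
population <- dice(unlist(pred), unlist(truth))

per_patient
population
```

The pooled value weights patients by lesion size, so it can differ noticeably from the average of the per-patient values when some patients have very small lesions.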

RF Predicted Volume Estimates True Volume

OASIS: not so much

Patient with Median Overlap in Test Set

R Package

  • smri.process: available on GitHub and Neuroconductor
    • relies on other Neuroconductor packages that are not on CRAN

Conclusions of Stroke Analyses

  • We can segment ICH volume from CT scans
    • Incorporate variability of estimated volume
  • We can create population-level ICH distributions
    • Uncertainty measures of this
  • Voxel-wise regression can show regions associated with severity
    • Validate these regions (MISTIE III)
    • Scalar on image regression

Neuroimaging and R

Authored R Packages:

  • fslr

    Muschelli, John, et al. “fslr: Connecting the FSL Software with R.” The R Journal 7.1 (2015): 163-175.

  • brainR

    Muschelli, John, Elizabeth Sweeney, and Ciprian Crainiceanu. “brainR: Interactive 3 and 4D Images of High Resolution Neuroimage Data.” The R Journal 6.1 (2014): 42-48.

  • extrantsr
  • ichseg

    Muschelli, John, et al. “PItcHPERFeCT: Primary intracranial hemorrhage probability estimation using random forests on CT.” NeuroImage: Clinical 14 (2017): 379-390.

  • dcm2niir
  • matlabr
  • spm12r
  • itksnapr
  • papayar
  • WhiteStripe
  • oasis
  • SuBLIME
  • googleCite
  • diffr
  • rscopus
  • glassdoor

Number of Downloads (CRAN packages)

From the cranlogs R package:

Thank You

Breiman, Leo. 2001. “Random Forests.” Machine Learning 45 (1). Springer:5–32.

Lesjak, Žiga, Alfiia Galimzianova, Aleš Koren, Matej Lukin, Franjo Pernuš, Boštjan Likar, and Žiga Špiclin. 2018. “A Novel Public MR Image Dataset of Multiple Sclerosis Patients with Lesion Segmentations Based on Multi-Rater Consensus.” Neuroinformatics 16 (1). Springer:51–63.

Sweeney, Elizabeth M, Russell T Shinohara, Navid Shiee, Farrah J Mateen, Avni A Chudgar, Jennifer L Cuzzocreo, Peter A Calabresi, Dzung L Pham, Daniel S Reich, and Ciprian M Crainiceanu. 2013. “OASIS Is Automated Statistical Inference for Segmentation, with Applications to Multiple Sclerosis Lesion Segmentation in MRI.” NeuroImage: Clinical 2. Elsevier:402–13.

Wright, Marvin N., and Andreas Ziegler. 2017. “ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R.” Journal of Statistical Software 77 (1):1–17. https://doi.org/10.18637/jss.v077.i01.